Fraud Detection by Stacking Cost-Sensitive Decision Trees
Abstract
Worldwide, billions of euros are lost every year due to credit card fraud. Fraud has increasingly diversified across digital channels, including mobile and online payments, creating new challenges as new fraud patterns emerge. Hence, finding effective methods of mitigating fraud remains challenging. Existing solutions include simple if-then rules and classical machine learning algorithms. Credit card fraud is by definition an example-dependent cost-sensitive classification problem, in which the costs due to misclassification vary between examples and not only between classes; for instance, misclassifying a fraudulent transaction may have a financial impact ranging from a few euros to thousands of euros. In this paper, we propose an extension to the cost-sensitive decision tree algorithm: we create an ensemble of such trees and combine them using a stacking approach with a cost-sensitive logistic regression. We compare our method with standard machine learning algorithms and state-of-the-art cost-sensitive classification methods using a real credit card fraud dataset provided by a large European card processing company. The results show that our method achieves savings of up to 73.3%, more than 2 percentage points above a single cost-sensitive decision tree.
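To make the notions of example-dependent cost and "savings" concrete, the following is a minimal sketch, not the authors' implementation. It assumes a cost model the abstract does not spell out: a missed fraud costs the transaction amount, every flagged transaction costs a fixed administrative fee (the hypothetical `admin_cost`), and savings are measured against the cheaper of two trivial policies (flag nothing, or flag everything). All function names and figures are illustrative.

```python
from typing import Sequence


def total_cost(y_true: Sequence[int], y_pred: Sequence[int],
               amounts: Sequence[float], admin_cost: float) -> float:
    """Sum the example-dependent cost of a set of predictions.

    Assumed cost model (illustrative, not taken from the abstract):
      - flagged transaction (true or false alarm): admin_cost
      - missed fraud: the transaction amount is lost
      - legitimate transaction left alone: no cost
    """
    cost = 0.0
    for label, pred, amount in zip(y_true, y_pred, amounts):
        if pred == 1:
            cost += admin_cost
        elif label == 1:
            cost += amount
    return cost


def savings(y_true: Sequence[int], y_pred: Sequence[int],
            amounts: Sequence[float], admin_cost: float) -> float:
    """Fraction of the baseline cost that the predictions avoid."""
    n = len(y_true)
    cost_flag_none = total_cost(y_true, [0] * n, amounts, admin_cost)
    cost_flag_all = total_cost(y_true, [1] * n, amounts, admin_cost)
    baseline = min(cost_flag_none, cost_flag_all)
    if baseline == 0:
        return 0.0  # nothing to save when the trivial policy is free
    return (baseline - total_cost(y_true, y_pred, amounts, admin_cost)) / baseline


# Ten hypothetical transactions, two of them fraudulent (30 and 8 euros).
y_true = [0, 0, 0, 1, 0, 0, 0, 1, 0, 0]
amounts = [12.0, 40.0, 5.0, 30.0, 22.0, 9.0, 60.0, 8.0, 15.0, 3.0]
y_pred = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]  # catches only the 30-euro fraud

print(f"savings: {savings(y_true, y_pred, amounts, admin_cost=5.0):.3f}")
```

Under this cost model two frauds are not equally important to catch: missing the 30-euro transaction is far more expensive than missing the 8-euro one, which is the distinction an accuracy-based metric would hide.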
Similar resources
Ensemble Classification and Extended Feature Selection for Credit Card Fraud Detection
With the rise of technology, the possibility of fraud in areas such as banking has increased. Credit card fraud is a crucial problem in banking, and its danger is ever increasing. This paper proposes an advanced data mining method that considers both feature selection and decision cost to improve the accuracy of credit card fraud detection. After selecting the best and most effec...
Ensemble of Example-Dependent Cost-Sensitive Decision Trees
Several real-world classification problems are example-dependent cost-sensitive in nature, where the costs due to misclassification vary between examples and not only within classes. However, standard classification methods do not take these costs into account, and assume a constant cost of misclassification errors. In previous works, some methods that take into account the financial costs into...
Credit Card Fraud Detection using Data mining and Statistical Methods
With today's advances in technology and business, fraud detection has become a critical component of financial transactions. Given the vast amounts of data in large datasets, detecting fraudulent transactions manually becomes increasingly difficult. In this research, we propose a combined method using both data mining and statistical tasks, utilizing feature selection, resampling and cost-...
A cost-sensitive decision tree approach for fraud detection
With developments in information technology, fraud is spreading all over the world, resulting in huge financial losses. Although fraud prevention mechanisms such as CHIP&PIN have been developed for credit card systems, these mechanisms do not prevent the most common fraud types, such as fraudulent credit card use over virtual POS (Point Of Sale) terminals or mail orders, so called online credi...
Example-dependent cost-sensitive decision trees
Several real-world classification problems are example-dependent cost-sensitive in nature, where the costs due to misclassification vary between examples. However, standard classification methods do not take these costs into account, and assume a constant cost of misclassification errors. State-of-the-art example-dependent cost-sensitive techniques only introduce the cost to the algorithm, eith...